Add support in core tools for running qualification on Dataproc GKE #613

Merged
merged 5 commits into from
Oct 11, 2023

@parthosa parthosa commented Oct 9, 2023

Contributes to #253. This PR adds support for Dataproc GKE platform in core tools.

Changes:

  • Operator scores are generated using the validate_qualification_estimates.py script.
  • Workload Information - Dataproc on GKE:
    • Controller: 1 X n1-standard-16
    • Driver: 1 X n1-standard-16
    • Executors: 8 X n1-standard-32 (with 8 X T4 for GPU runs)

Metrics on operator scores from validation script

==================================================
            Duration Error Metrics
==================================================
Average duration error (seconds)  = 577.53
Median duration error (seconds)   = 577.53
Min duration error (seconds)      = 577.53
Max duration error (seconds)      = 577.53
Average duration error (diff pct) = 9.85
Median duration error (diff pct)  = 9.85
Max duration error (diff pct)     = 9.85
Average duration error (diff sec) = 577.53
Median duration error (diff sec)  = 577.53
Max duration error (diff sec)     = 577.53
Average duration error (abs pct)  = 9.85
Median duration error (abs pct)   = 9.85
Max duration error (abs pct)      = 9.85
==================================================
            Speedup Error Metrics
==================================================
Average speedup error (diff)      = -0.36
Median speedup error (diff)       = -0.36
Min speedup error (diff)          = -0.36
Max speedup error (diff)          = -0.36
Average speedup error (diff pct)  = -10.94
Median speedup error (diff pct)   = -10.94
Max speedup error (diff pct)      = -10.94
Average speedup error (abs diff)  = 0.36
Median speedup error (abs diff)   = 0.36
Max speedup error (abs diff)      = 0.36
Average speedup error (abs pct)   = 10.94
Median speedup error (abs pct)    = 10.94
Max speedup error (abs pct)       = 10.94
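
For readers unfamiliar with this kind of report, here is a minimal sketch of how summary statistics like the ones above could be computed from per-workload estimates. It is not the code in `validate_qualification_estimates.py`: the helper name `summarize_errors`, the dictionary keys, and the sign convention (error = estimated minus actual, percentage taken relative to actual) are all assumptions made for illustration, and the input numbers are made up.

```python
from statistics import mean, median


def summarize_errors(estimated, actual):
    """Summarize estimate-vs-actual errors.

    Hypothetical helper for illustration; not the validation script's API.
    Assumed convention: error = estimated - actual, pct relative to actual.
    """
    if len(estimated) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [e - a for e, a in zip(estimated, actual)]
    pcts = [100.0 * d / a for d, a in zip(diffs, actual)]
    abs_pcts = [abs(p) for p in pcts]
    return {
        "avg_diff": mean(diffs),
        "median_diff": median(diffs),
        "min_diff": min(diffs),
        "max_diff": max(diffs),
        "avg_pct": mean(pcts),
        "avg_abs_pct": mean(abs_pcts),
        "max_abs_pct": max(abs_pcts),
    }


# Made-up numbers for illustration only; not taken from the PR.
single = summarize_errors(estimated=[110.0], actual=[100.0])
print(single)
```

With a single input pair, the average, median, min, and max are necessarily equal. The report above shows that same pattern, which would be consistent with validation against one workload, although the PR does not state this.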

@parthosa parthosa added the core_tools Scope the core module (scala) label Oct 9, 2023
@parthosa parthosa self-assigned this Oct 9, 2023
Signed-off-by: Partho Sarthi <[email protected]>
Signed-off-by: Partho Sarthi <[email protected]>
@parthosa parthosa marked this pull request as ready for review October 10, 2023 21:37
Signed-off-by: Partho Sarthi <[email protected]>

@nartal1 nartal1 left a comment

LGTM. Thanks @parthosa !

@cindyyuanjiang cindyyuanjiang left a comment

LGTM

@parthosa parthosa merged commit 4dec452 into NVIDIA:dev Oct 11, 2023
8 checks passed
@parthosa parthosa deleted the spark-rapids-tools-253-core-tools branch October 11, 2023 18:02